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ABSTRACT 



Certain methods of analysis for testing the hypothesis 
that organized motion exists in the airflow above naturally 
occurring water waves are examined. In particular, two cur- 
rent methods, spectral analysis and joint probability density 
function analysis, are briefly discussed; two new methods, 
matched filters and parametric time series analysis, are 
suggested. 

An analysis is done with parametric time series analysis 
on wind-wave data. The results show the tractability of 
this data to parametric methods, and the results are in 
agreement with previous spectral analyses of the data in the 
regions of overlap. 
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I. INTRODUCTION 



In recent years, interest in parameterizing and defining 
processes in the air-sea boundary layer has intensified. Re- 
cent theoretical models [Miles, 1957] postulate, among other 
things, that the airflow near the surface and the waves are 
coupled in such a manner as to produce organized motion in 
the air in relation to the wave field. This results in air 
motion with the same period as the dominant period of the 
waves. This paper examines certain methods of analysis by 
which these hypotheses can be tested. In particular, two 
current methods are discussed, two new methods suggested, and 
an analysis using one of the new methods is performed. 
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ll. BACKGROUND 



If organized motion exists in the air due to the presence 
of the underlying waves, its existence and nature should be 
reflected in observations of the wind over the waves* Unfor- 
tunately for a prospective analyst of such observations, this 
regime is characterized by the presence of a general turbulent 
field which serves to mask the organized motion. The problem, 
then, is to discover whether organized motion does exist, and 
if it does, its nature, in the midst of random turbulent in- 
fluences. This should be determined from data obtained over 
natural waves. 

It is the usual case that data of interest take the form 
of simultaneous continuous time traces of variables of inter- 
est, like wind components and wave heights. For purposes of 
ease of analysis, such records are usually subsequently dis- 
cretized so that the final form is that of a multivariate 
discrete time series. This is the case with data used in 
this paper. That data was obtained from observations over 
natural waves on Lake Michigan; the data are fully described 
elsewhere [Davidson, 1970]. 

Time series differ from other types of statistical sample 
realizations in that the observations are assumed to be de- 
pendent upon one another, whereas in the larger body of sta- 
tistical technique, it is preferred that observations be 
independent. The exploitation of this expected dependence 
is a distinguishing characteristic of time series analysis. 
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As is suggested by the form of the data, the majority of the 
techniques considered here are based on time series analysis. 
One method, that of joint probability distribution function 
analysis, is not drawn from the body of time series analysis, 
but it also uses the dependent nature of the data to advan- 
tage, though indirectly. 

A. SPECTRAL ANALYSIS 

Most widely known of the time series analysis techniques 
is spectral analysis. This technique examines the data in 
the frequency domain. The most common representation of spec- 
tral results is the sample spectrum, whose magnitude for a 
given frequency reflects the amount of variance in the data 
which can be accounted for by the presence of a cosine compon- 
ent at the given frequency. It is, in fact, the Fourier 
cosine transform of the estimated autocovariance function. 
Hence, the spectrum of data arising from a process dominated 
by a periodic effect should exhibit a marked peak at the fre- 
quency of the periodic effect, while the spectrum of white 
noise should be flat. Thus noise should raise the level of 
the spectrum, but should not obscure significant peaks. Gen- 
erally, if the hypothesized relationship hold, one would ex- 
pect a peak in the wind component spectra at the same frequency 
as the major peak in the wave spectrum. 

Spectral analysis also has a natural extension’ into multi- 
variate space, called cross-spectral analysis. As the co- 
variance of wind and waves is of primary interest here, 
calculation of cross-spectra should reveal some information 
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of interest, including estimates of phase relationships, co- 
herence, and cross-coyariance with respect to frequency. Pre- 
dictions from theoretical models for these values may then be 
compared to the estimates resulting from actual observations 
as an indication of the validity of the theory. 

Spectral analysis is not without major drawbacks, however. 
A major handicap is that it can only deal adequately with 
stationary time series. Often natural time series are not 
observed to be stationary, and may require elaborate algo- 
rithms to be applied in an effort to remove trends and/or 
seasonality to obtain s ta t ionarity . A second difficulty is 
the lack of smoothness in actual sample spectra. A typical 
sample spectrum is so jagged that interpretation is difficult. 
The usual remedy is to smooth the data, smooth the spectrum, 
or both. This makes the information more readily apprehended, 
but the complex effect of such smoothing on individual esti- 
mates of spectral ordinates makes the statistical significance 
of the estimates difficult or impossible to determine. It 
may also result in a loss of information contained in the 
data. Further, while confidence intervals for unsmoothed 
estimates of spectral ordinates may be calculated theoretical- 
ly, theory also shows that the error in the estimate of the 
spectral ordinate will be of the same order of magnitude as 
the spectral ordinate itself jKendall and Stuart, 1966]. 

These facts make it difficult to quantitatively support the 
belief that a particular peak in a smoothed spectrum is 
significant. Thus Kendall and Stuart [1966] rightly caution 
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meteorologists and oceanographers by name about assuming the 
existence of periodic elements in the data generating process 
solely on the basis of spectral analysis; care is necessary. 
Further discussions of other possible pitfalls as well as much 
more complete discussions of spectral analysis are to be found 
in Blackman and Tukey 11958], Kendall and Stuart [1966], and 
Jenkins and Watts [1968], 

Although unsatisfying in some statistical senses, spectral 
analysis is a useful tool to be applied to looking for wind- 
wave coupling. The data used in this paper was extensively 
analyzed spectrally by Davidson [1970], with good results. 

B. MATCHED AND WIENER FILTERS 

Another type of analysis related to spectral analysis 
which may hold promise for the wind-wave coupling problem is 
to be found in the repertoire of electrical engineering. 
Electrical engineers are frequently faced with the problem 
of detecting and isolating a signal in the midst of noise, 
precisely analogous to looking for organized motion in the 
air in the midst of turbulence. To solve this problem, two 
types of spectrally derived filters have been developed, de- 
signed to give a large response to signal and a small res- 
ponse to noise. They are matched filters and Wiener filters. 
To the best knowledge of the author, no one has yet applied 
such filters to wind-wave data looking for organized motion, 
but Sokol I1971J applied this technique to detection of ther- 
mal plumes in the air adjacent to the sea from temperature 
traces of a type similar to the wind-wave data. His results 
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were quite encouraging for the hypothesis that such an analy- 
sis might be useful in the detection of organized motion in 
turbulence . 

At present, the major problem with the application of 
these techniques to wind-wave data appears to be sensitivity 
to small errors in estimates of certain parameters which are 
difficult to estimate accurately, like the phase of the per- 
iodic effect. However, the magnitude of such problems is as 
yet unknown, and does not preclude at least a comprehensive 
preliminary examination of these techniques for practicality 
with respect to the wind-wave problem. 

C. JOINT PROBABILITY DENSITY FUNCTION ANALYSIS 

The one analysis technique currently used in this field 
which is not properly a member of time series analysis is 
joint probability density function analysis. Basically, this 
method examines the data in the probability domain. In the 
analysis, the probability density function of the set of 
joint occurrences of two variables is calculated. Equal 
joint probability density contours are plotted on a two di- 
mensional array. The contours are then subjectively analyzed 
for deviations from the concentric circle pattern of a theo- 
retical random, independent joint probability density func- 
tion, the current reference function being the bivariate 
standard normal joint probability density function; The 
pattern of these deviations, which is assumed to arise be- 
cause of the dependent nature of the time series, is then 
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evaluated in terms of the generating process and compared to 
what is expected by theory. This type of analysis has been 
applied to meteorological data by Holland I1972J, Daytdson 
and Frank I1972J, and Frank I1971J. 

This method shows great promise; however, it lacks refine- 
ment as yet. The choice of the bivariate standard normal 
joint probability density function as a reference distribution 
is formally incorrect, although it may be satisfactory in 
cases involving many degrees of freedom. The proper choice 
is a bivariate distribution analogous to the Student-t distri- 
bution in the univariate case. Using such a distribution, 
described in some detail in Birnbaum [1962], it appears quite 
feasible to develop an objective test of the significance of 
deviations from the distribution expected if the generating 
processes were random and independent. Further tests against 
such a reference distribution reduce, effectively, to t-tests 
for correlation coefficients and regression coefficients cal- 
culated between the two variables [Birnbaum, 1962, pages 223- 
242]. It would appear beneficial to include at least these 
last tests as a normal part of the analysis. 

D. PARAMETRIC TIME SERIES ANALYSIS 

The final method is that advanced by Box and Jenkins 
[1970], called by some, parametric time series analysis. 

This method involves fitting a stochastic model to the time 
series, while allowing, in keeping with modern statistical 
thought, the data themselves to shape the model and the analy- 
sis itself to a relatively great extent. This method is 
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relatively new; its capabilities and limitations are not com- 
mon knowledge in the manner that those of spectral analysis 
are. In order to better understand these capabilities and 
limitations, particularly as they apply to the boundary layer 
data in this study, it is necessary to discuss the form of 
the models to be fitted. This is done in Sections 1 through 
3, which describe, though with considerably less detail,, the 
ideas expressed by the primary source. Box and Jenkins text 
[1970]. With this background already given, the particular 
applications to wind-wave coupling investigations are dis- 
cussed in Section E. 

1 . Yule^s Proposition ^ 

Probably no process generating experimental observa- 
tions is completely deterministic. Thus, deterministic 
models are not usually adequate in analyses, although physi- 
cists and others have successfully used them in analyses of 
data generated by an apparatus designed to be so accurate 
that noise contributions to the observations are negligible 
in relation to results of interest. But if the noise is of 
a significant magnitude, as with turbulence data, models in- 
cluding noise must be used. If the probability structure of 
the process is of interest, as in the case of atmospheric 
turbulence studies, the normal course is to use a stochastic 
model. The model fitted by parametric time series analysis 
is stochastic. 

Central to parametric time series analysis is Yule’s 
proposition [Yule, 1927] that a time series whose elements 
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are highly dependent, as is the case with wind-^wave records, 
may usefully be viewed as having arisen from the application 
of a linear filter to a parallel series of independent random 
shocks drawn from a fixed distribution. This viewpoint is 
purely heuristic, and need not correspond to the actual pro- 
cess at all to be useful. The proposition is symbolized in 
Figure la. 



In applying this idea, the fixed distribution of the 

series of random shocks will be taken to be normal with mean 

2 

zero and variance . With the random shock series repre- 

sented by a^’s, the observed time series by z^’s, and the 
linear filter elements by ’ s , the relationship can be 
symbolized as 






= y + (1 + + ...) 

= y + a^. (2.1) 

where B is the standard backward shift operator, and 

B = <2-2> 

If the sequence is finite, or infinite and convergent 

for B given a value of unity, the filter is said to be con- 
vergent. If the filter is convergent, then z(t) is a station- 
ary time series, and \i is the mean of the series. If the 
filter is unstable, ]i is an arbitrary reference point for the 
process level, and zft) is non-s ta t ipnary . 

2 . The ARIMA Model 

Parametric time series analysis is based on fitting 
a particularly useful subset of linear filters of the kind 
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a. Yule's Model 






b. ARIMA Model 



ARIMA 




c. Transfer Function Model 







Figure 1. Filter Models 

^ after Box & Jenkins [1970J 
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just described. This class of models is known as the autore- 
gressive integrated moving average CARIMA) class. As the name 
suggests, tliia model incorporates three separate features, 
the autoregressive terms, moving average terms, and integra- 
tion operator. The model is easier to understand if one dis- 
cusses each part separately, building the total structure as 
new features are discussed. 

The autoregressive model associates the current value 
of the process with a linear combination of past values of 
the process and a current random shock. An autoregressive 
process of order p, AR(p) , would be: 

+ a . (2.3) 

where = z ^ V* if one considers the past values of z^ as 

the '’independent" variables, then one has the standard linear 

regression relation, hence the name autoregressive. The p+2 

2 

parameters: <}) ^ , <}) 9 > • • • , 4 ^ » y > , must generally be estimated 

X z pa 

directly from the data. 

It is clear that by substituting back within the 
model (Equation 2.3) for the past values used by the model, 

e . g . 



^ - ♦i 5t-l *2 -t 



’t-1 * *p^t-p- 



+ a, 



p t-p -1 t -1 



(2.4) 



one can eventually produce a representation of the AR(p) pro- 
cess as an infinite series in the a^’s, and hence, as a 
linear filter of the type described previously. Writing the 
Ar(p) process as 

(})pCB)z^ = a^ C2.5) 
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and 



^CB)a^ - ij. (2.6) 

then one obtains 

= iJjCB) (2.7) 

with the same constraints for stationarity of ^(B) as noted 
previously . 

The second important process is the moving average 
process of order q, MA(q) which relates the current value of 
to past values of the a^’s. The MA(q) process may be re- 
presented as 






Writing the MA(q) process as: 



^t = > 



(2.9) 



it is immediately seen that this process already has the 
form of a linear filter where 



= 0^(B) , (2.10) 

In a formal statistical sense, the form described above is 

not confined to moving averages of the a^*s, since the 

need not be positive, nor sum to one, but this is standard 

nomenclature nonetheless. The q+2 parameters of the model: 

2 

0- , 0^,..., 0 , y, a , again generally depend directly on 

^ q a 

the data for their estimation. 

These two processes may be mixed to achieve parsimony 
of the model Cdiscussed in a later section). The result is 
an autoregressive moving average process of order (p,q) that 
is , ARMA (p , q) : 
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t ^1 t-1 ^p t-p t 1 t"l q t-q (2.11) 



or 



%CB) - e 



( 2 . 12 ) 



Here a total of p+q+2 parameters: (f)^,..., c|)^, 0^,..., 9^, y , 

2 

0 ) are required, and in general must all be estimated 

a 

directly from the data. If q is finite, 6^(B) is a stable 
filter. For the equivalent filter of the ARMA (p,q) process 
to be stable, then, it is sufficient to require <1)^ ^(B) to 
be stable, as the ARMA (p,q) equivalent filter is a simple 
infinite series: 

(2.13) 



'li(B) = (B) e^(B) 

Further, it is clear that the ARMA (o,q) model is equivalent 
to the MA(q) , and the ARMA (p,o) to the AR(p) when all are 
applied to the same z^’s and a^’s. 

For reasons of simplicity and stability of calcula- 
tion, it is useful to require that the only ARMA(p,q) proces- 
ses considered be those which are stable, and hence would 
result in a stationary z(t) if applied to a sequence of ran- 
dom shocks, a(t). But many observed time series are inade- 
quately fitted by such an ARMA (p,q) model because they are 
non-s ta tionary in nature. Many of these may be adequately 
fitted by using a generalized autoregressive operator ^(B) 
in place of (|)pCB). The result will be an autoregressive in- 
tegrated moving average model. This is shown schematically 
in Figure lb. 
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It can be shown that the roots of the equation 



Q = <^pCB) 

= 1 + (f)^ B + <j>2 B^+...+ (f)p B^ (2.14) 
where B is now considered in the same fashion as a variable 
in a polynomial, must lie outside the unit circle for (j)^ 
to be stable. It can further be shown that if any of the 
roots lies inside the unit circle for either 

0 = (2.15) 



or 



0 = 0q(B) (2.16) 

then the ARMA(p,q) model exhibits explosive growth, thus vio- 
lent non-stationari ty . However, it turns out to be very use- 
ful in treating many types of non-s tationarity to allow one 
or more of the roots of 

* cp(B) = 0 (2.17) 

to be equal to unity. Letting the V operator symbolize this: 

V = (1 - B) (2.18) 

one can then write 9^(B) as; 

<^(B) = (J)p(B) 

= 4.p(B) (1-B)‘^ (2.19) 

where d is the number of roots of Q^(B) set equal to unity. 
Thus, the autoregressive integrated moving average model may 
be written: 

(PCB)5j. = e^(B) a^. (2.20) 

or 

(}) (B) 2^ = 6 (B) (2.21) 

t q t. 
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This model can be symbolized by ARIMA (p,d,q). Substituting 

Wj. == C2..22) 

produces a process in the which can be represented by a 

stable equivalent filter, since the process may be written 

(J)p(B)w^ = (2.23) 

which is a simple ARMA (p,q) process in the w^^s. This re- 
veals the types of non~s t a t ionar i ty which can be handled. In 
order for Equation (2.23) to be true, w(t) must be a station- 
ary series. Hence, any series z(t) which can be reduced to 
stationarity by forward differencing by the operator can 
be adequately fit by such a model. This includes any series 
whose **trend** can be adequately approximated by a polynomial 
of degree d or less. 

We note that the term integrated arises because 
-1 ^ 

V z^ = Sz^ = Z z^ (2.24) 

t t , h 

h = -00 

and, for example 

3 t t t 

S z^ = Z Z X (2.25) 

t . - h , 

sr — 00 j sr — OO OO 

Thus, in forecasting a future value of z^, say one uses 

the summation operator to go from the predicted and derived 
w^’s to the predicted 

There is still one important characteristic that a 
time series may have that is useful to be able to handle: 
seasonality. Many treatments of seasonality are possible 
within the framework exhibited so far. This discussion will 
be limited to the multiplicative technique described below. 
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Box and Jenkins [1970] should be referenced for more detailed 
information on both this and other techniques. 

The multiplicative technique is a simple extension of 
the ARIMA (p,d,q) model. The basic assumption is that the 
variation between observations separated by an increment of 
time equal to the period of the seasonal component can be 
adequately fitted by the same sort of models as used on adja- 

3 

cent observations. Hence, replacing B by B in the ARIMA 
(p,d,q) model, we get a seasonal representation: 

Vj Sj. = ©q(B®) (2.26) 



or 



or 



Wt = (<J>J^B® ‘ ^+. . .+'J>pB® ’^) 

- ( 0^ B^’^) a^+ a^. 



W 



= <I>T +. . .+ ^ + a^ 

1 t-s P t-P • s t 



- 0. w . 

1 t-s Q t-Q • s 



(2.27) 



(2 .28) 



where 

^ 2, = (1 - B^)^ 

But it is also clear that the process will also de- 
pend on adjacent values, so the complete multiplicative model, 
the ARIMA (p,d,q) x (P,D,Q) is: 

(f,p(B)$pCB^)V^ 2^ = e^(B) ©^(B®)a^ (2.29) 

This result is a very powerful class of models for fitting 
naturally occurring time series. This is the class upon 
which parametric time series analysis is based. 
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3, Transfer Function Models 



Box and Jenkins [1970J further discuss transfer func- 
tion models designed to fit cases in which two processes are 
related by 

D+...+Hj^D^) Y(t) = (Hq+H^ D+...+ H^D®)X(t-T) (2.30) 

where D E d/dt, the H*s and H’s are unknown parameters, and 
T is the dead-time or delay in the system. Where and 
are discrete time series measured at equispaced times, (2.30) 
becomes 

(l+C3^V+. . = (TiQ+nj^V+. . .+r|gV^)Xj._^ (2.31) 

where D is replaced by V S 1-B as before. Substituting 
B = 1-V, we obtain 

(1-5tB-...- 5 B^^)Y = (U)„- w,B-...-w B®)X^ ^ 

1 r t 0 1 s t-b 

or 

6(B)Y^ = w(B)b'^Xj. 

= fi(B)X^ (2.32) 

As in the ARIMA (p,d,q) case, this relation may be written in 
the form of a linear filter: 

Y^ = 6”^ (B)f2 (B)X^ = v(B)X^ (2.33) 

which is stable if the v(B) series converge for B given a 
numerical value such that 0 <; [b| 1. It is to be noted that 

the transfer function model and the stochastic ARIMA models 
are somewhat parallel. 

Now, remembering the discussion on the lack of deter- 
ministic time series, it is necessary to consider a stochastic 
transfer function model by adding superimposed noise, N^, 
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to the process, where N^. and are independent. Hence, we 
obtain 

- yCB)X^ + (2.34) 

for the stochastic model. But there is no reason to suppose 
that the process does not effect the noise in some way, or 
that the noise contribution is unstructured. Therefore, the 
idea is advanced that the noise, N^, will be represented as 
having the form 

Nj.= i|j(B)a^= <P~^(B)0(B)aj. (2.35) 

where the * s , 9^ * s , and 0’s are as before. This form includes 
unstructured noise, the raw random shocks, a^’s, as a single 
subset of the possible structures. Hence, the model one would 
be led to fit is 

v(B)X^+^(B)a^ 

= 6 ~^(B)n(B)Xj.+ 92~l(B)0(B)aj. (2.36) 

This is shown schematically in Figure Ic. 

E. APPLICATIONS TO WIND-WAVE COUPLING 

For an application of these techniques to the problem of 
detection of hypothesized organized motion in air adjacent to 
natural waves, it seems clear that a transfer function stoch- 
astic model is likely to gxye the most direct information as 
to the structure of the relationship between the waves and 
the adjacent airflow. One would fit the transfer function 
model by regarding the wave heights as inputs, , and a wind 
velocity component as the output, Y^. The existence of signi- 
ficant values for the v’s would immediately suggest that there 
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are fluctuations in the oyerlying airflow which are not random 
in relation to the waves, particularly if such values appear 
repeatedly for data from different observational periods. It 
is a fortunate circumstance that statistical tests of signifi- 
cance for these parameters exist. However, the accuracy of 
these tests depends on how closely the parent distribution of 
the actually calculated random shocks corresponds to the dis- 
tribution picked to be the theoretical parent of the random 
shocks in the model. The structure of the fitted model may 
give clues as to the nature of the actual process. It is an 
unfortunate circumstance, however, that the transfer function 
models are beyond the scope of this paper. 

The analyses performed were involved fitting ARIMA (p,d,q) 
X (P,D,Q)*s models to individual time series. Three benefits 
may be possible -from this procedure. First, models fitted 
individually to wind components and wave heights from the same 
observations may be compared for suggestions of interaction. 
This type comparison is difficult, however, because many indi- 
vidual models with differing orders in the various types of 
processes may be quite similar in result and fit, depending 
on the data. Hence, for example, one may not be able to say 
that an ARIHA Cl>d,0) x C0>0>0) represents data with a differ- 
ent structure than that of a series best fit by an ARIMA 
(0,d,2) X CO, 0,0). The latter arises because these two models 
closely correspond for certain sets of values of the para- 
meters involved. 
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The next benefit is that the fitted model may suggest in 
itself a possible structure for the process. Also possible 
and more likely is the fact that repeated analyses of this 
kind may help a researcher following an adaptive model build- 
ing strategy to infer a general model for given conditions. 
From this, he may infer some of the structure of the parent 
process. The researcher is helped in this quest by the sta- 
tistical confidence intervals it is possible to obtain for 
parameter estimates and joint confidence intervals for groups 
of parameters (with the same caution here as presented in the 
transfer function model discussion directly above). 

Third, and seemingly most important, this study may give 
an idea of the usefulness of these techniques in fitting to 
boundary layer data. The study intended to give an indication 
whether such data is of the types that this sort of analysis 
can handle, for pre-evaluation of the usefulness of further 
studies along these lines. 

A final note of interest is that Yule’s proposition [Yule, 
1927], discussed in I-D, may have a physical significance for 
wind-wave data. It may be possible to view the generalized 
turbulence of the airflow adjacent to natural waves as a 
parent random process producing a series of a^’s, and the 
waves as acting like a linear filter on these to produce 

organized motion and a resultant structuring of the noise 
elements, that is, the observed turbulence. Recent papers by 
Reynolds 11968] and Davidson and Frank [1972] suggest that 
the structure of the turbulent field is important in many 
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facets of air-sea interaction. Hence, parametric time s 
analysis may hold great potential in boundary layer stud 
However, to reiterate a previous statement, it is not ne 
sary that there exist any correspondence at all between 
actual generating process and the Yule viewpoint for thi 
of analysis to be beneficial. 
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III. ANALYSIS CONSIDERATIONS 



The data considered were from an August 19 observation 
period, described in Davidson [1970]. From these data three 
time series were chosen; wave height, u-component of the 
wind 1.5 meters above surface, and w-component of the wind at 
1.5 meters. ARIMA models were fitted to each of these series 
and examined for goodness of fit. The models were examined 
for information contained in them about the structure of the 
time series to which they were fit. 

The primary feature of parametric time series analysis 
is model fitting. An iterative approach to model building 
was used which allows the data to influence the model struc- 
ture. The steps in this procedure are discussed in Section A 
Again, a more complete discussion of this procedure is found 
in Box and Jenkins’ text [Box and Jenkins, 1970, p. 18]. 

A, MODEL BUILDING 

In fitting ARIMA models, an iterative model building pro- 
cedure is most fruitful. The steps of such a procedure are 
described below and the process is summarized in Figure 2, 

1 . Selection of General Class of Model 

The first step in model building is to postulate a 
general class of models to be fitted. By choosing to perform 
parametric time series analysis, the researcher has implicit- 
ly chosen to use the ARIMA class of models. However, further 
choices are possible at this stage. Due to the very nature 
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Figure 2. Model Building 

- after Box & Jenkins [1970] 
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of the problem, one is virtually forced to consider the 
seasonal extension to the ARHIA (p>d,q) model. 



2 . Identification of Model 

The second step in the procedure is the identification 
of a particular model or subset of models from the general 
class chosen. This process can be influenced both by outside 
knowledge and information contained in the data. 

Having chosen to consider an ARIMA (p,d,q) x (P,D,Q) 
model, theory suggests that the order parameter d should be 
zero, and D should not. The need for d/0 would imply a notice- 
able change in the mean sea level was taking place during the 
period of observation. The need for D=0 would imply that no 
seasonal change in the time series at all was taking place 
during the period of observation. Seasonal variation certain- 
ly does occur in the waves, and we expect seasonal variation 
in the air. 

It can be shown that the stable non-s ta t ionary opera- 
tor (1-B^) has zeroes at e— ^ (k=0 , 1 , . . . , s-1) . The 

application of this operator to a time series will remove a 
seasonal component of the form 



Is/2J 



seasonal component - b + I {b ^ , cos ( — J -) -fb ^ , sin } 

Oj^^lj s 2j s 



. 2tt j 



where the b’s are adaptive coefficients implicit in the data, 
and IS/2] ^ S/2 if S is even, or IS/2J = 1/2 (S-1) if S is 
odd. This type of operator is a parsimonious representation 
of seasonal components of a type which require many sine and 
cosine waves to represent them adequately. An example of 



34 



such a seasonal component would be a spike at periodic in-- 
t ervals . 

Theory predicts that the main seasonal effect will 
have the form of a single sine wave at the frequency of the 
water waves. The operator is not a parsimonious repre- 

sentation of a single sine wave. Furthermore, the adaptive 
coefficients of at least some of the additional sine and 
cosine waves contained in the operator are likely to be 
inflated by intermittent and statistically non-significant 
periodicities in the random turbulent component of the motion. 
Therefore, an operator representing a single sine wave at the 
desired frequency, adaptive in phase and amplitude to the data, 

was applied in place of the operator in some models. This 

2 

operator is [1-2 cos (2tt/p)B + B ], where p is the desired 
period. This stable non-s t a t iona ry operator, which has zeroes 
at will remove a seasonal component of the type: 

seasonal component = b- sin(^-^)+ b^ cos (-^) 

1 p 2 p 

= b ^ sin (-^ + 6 ) 

3 p 

where the b’s are adaptive coefficients, 6 is an adaptive 
phase angle, and p is the desired period. 

Parsimony is a primary consideration at this stage. 
Parsimony involves making use of the most efficient type of 
model possible. For example, while an AR(1) process can be 
theoretically represented by an infinite KA process, and by a 
finite MAfq), q > 1, process in practice, it is most parsi- 
moniously represented by the AR(1) model involving only one 
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parameter as opposed to an HAC<j) process involving cj > 1 para- 
meters. Similarly, an AR(p) model is not a parsimonious re- 
presentation of an MACI) process, nor is either an AR or MA 
model a parsimonious representation of a mixed ARKA process. 

Hence, it is necessary to go to the data for an indi- 
cation of the values of p,q,P, and Q which are best fit, 
remembering that in practice a value of two or less for each 
is normally sufficient. To get this indication, one examines 
the autocorrelation function and partial autocorrelation 
function. A brief discussion of what one looks for is to be 
found in Appendix A. Estimates of p,q,P and Q are made^ and 
rough estimates of the fitted parameters, the ’ s , $’s, 0’s, 

and ®’s, are made. 

3 . Estimation 

The third' stage of the procedure is to fit the model 
to the data. The rough estimates of the fitted parameters are 
now used as initial guesses for iterative estimation routines. 
At this stage, the best fit for the model chosen is obtained. 

4 . Diagnostic Checks 

The final stage is diagnostic checking to determine if 
the fitted model is adequate. If it is, the procedure ends, 
and the model is chosen. If the model is determined to be in- 
adequate or unsatisfactory, the process is iterated from the 
second step until a suitable model is either found or deter- 
mined to be non-existent. 

A more complete discussion of the diagnostic tools 
used is to be found in Appendix B. 
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B. PRELIMINARY ANALYSES 



The preliminary analysis produced estimated autocorrela- 
tion function values, estimated autocovariance function values, 
and estimated partial autocorrelation function values of the 
time series which resulted from the application of the 
operator, with various values of d and D, and appropriate 
values of s, to the initial data. These estimated function 
values were also calculated for the time series resulting when 
was replaced by [1-2 cos (2tt/p)B + B^], the less general 
operator discussed in A-2. From these estimated function 
values, the proper types and degrees of differencing needed 
to produce a stationary time series were estimated. Further 
analysis of the estimated function values for the properly 
differenced series resulted in estimates for the proper values 
of the parameters p,q,P, and Q, for the multiplicative ARIMA 
(P>d,q) X (P,D,Q) model to be fit. The set of the most likely 
model candidates was the result. 

C. ANALYSES 

The preliminary analysis yielded the set of the most like- 
ly model candidates; the parameters required by these models, 
the (f) ’ s , $’s, etc., were estimated by mapping the maximum 

likelihood surface of the parameters, as reflected by the sum 
of the squares of the estimated (that is, the sum of the 

squares of the residuals). Thus, -these estimates were ”least 
squares" estimates, and this method is an accepted routine 
for non-linear least squares estimation. Diagnostic checks 
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were then applied to the resulting from the use of the 

i 

best estimates for the parameters in each model. These diag- 

V 

nostlc checks included comparing the magnitudes of the sums 
of the squares of the residuals of different models, calculat 
ing the autocorrelations of the residual series, and calculat 
ing a chi-square test for adequacy of fit [Box and Jenkins, 
1970, p. 291]. These diagnostic checks were then compared in 
an effort to determine the best model of those examined. 

Unfortunately, the author was unable to get a Marquardt 
maximum descent nonlinear least squares algorithm to work. 

Had this been done, reasonably accurate confidence intervals 
and joint confidence intervals for parameter estimates would 
have been easily available in addition to the speedier arri- 
val at the least squares estimates. As this was not done, 
alternate methods for obtaining these estimates, set forth 
in Box and Jenkins [1970, p. 228], were explored. Consider- 
ing the lack of real need for such interval estimates in this 
study, due to the small number of time series analyzed and 
subsequent preclusion of a meaningful attempt at an adaptive 
model building strategy, these methods were found to be too 
inefficient and time consuming to be of sufficient relative 
value to attempt. Hence, although no confidence intervals 
were estimated, their exclusion was not significant in this 
s t udy . 

In summary, the estimated autocorrelation functions, 
estimated autocovariance functions, and estimated partial 
autocorrelation functions of the time series resulting from 
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application of the difference .operators to the original time 
series were used to identify the proper degree of differenc- 
ing and the most likely types of models to be fit to the pro- 
perly differenced series. The parameters of these models were 
estimated by a nonlinear least squares mapping technique. The 
residuals, the resulting from each type of model when the 

least squares estimates for the model’s parameters were used, 
were tested for indications of the model’s goodness of fit. 

The results of these tests were compared, and the best model 
was chosen. 
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IV. RESULTS 



The results of the pajrametr ic tijae series analyses were 
in good agreement with the spectral analyses done with the 
same data by Davidson 11970]. The results clearly indicate 
the tractability of this type of data to this type of analy- 
sis. 

A. PERIODIC EFFECTS 

The difference operator was applied to the data for 

several values of d and D. Table I contains the values ap- 
plied. From both spectral analysis and autocorrelation func- 
tion estimates, it was determined that the dominant period of 
the waves was best estimated as six seconds, corresponding to 
a period of 30 lags in the time series, while the air was 
observed to have a dominant period of only 4.7 seconds, cor- 
responding to 24 lags in the time series, in both the hori- 
zontal and vertical components. The estimated autocorrelation 
functions of all three time series before differencing. Figure 
3, shows the form of a damped sine wave, but the damping is 
not great. The autocorrelations at high lag numbers do not 
fall off sufficiently fast to support an assumption that 
these autocorrelation functions have arisen from stationary 
time series. Tt can be seen in Figure 4 that application of 

the operator did not result in a picture more compatible 
s 

with the assumption that stationarity resulted. None of the 
dif f erencings involving the operator were satisfactory 
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TABLE 1 

DEGREES OF DIFFERENCING EXAMINED 
V*^V^z or 1 1-2 cos (2 it/p ) B+ B^]z 

St t 

p=s=24 for Airflow; p=s=30 for Wave heights 

d D S [1-2cos(2it/p) B+B^ ] 
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No 


No 


1 


0 


No 


No 
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0 


No 


No 


0 


1 


24/30 


No 


0 • 
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No 
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24/30 


No 
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2 


24/30 


No 


2 


1 


24/30 


No 


2 


2 


24/30 


No 


0 


0 


No 


Yes 


1 


0 


No 


Yes 
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0 


No 


Yes 


0 


1 
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Yes 
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when D ^ 0, Applying the alone, with d = 2 , resulted in a 

satisfactory pattern for this data CFignre 5). However, the 

2 

difference between the V operator and the non-general opera- 

2 . 

tor 11-2 cos (27t/p)B + B J, discussed in III-A-2, is small 

for p=24 for the wind, and p=30 for the waves. The [1-2 cos 
2 

(27t/p)+ B ] operator gives a slightly better result (Figure 6) 

and with the support from theory, the assumption that this 

operator is the proper one seems justified. No value d ^ 0 

gave a better result than d = 0 when the full [ 1-2 co s ( 2tt/p ) B 
2 

+ B ] operator was applied, so the final series considered was 

w^ = [1-2 cos (2tt/p) B + B^] 

The result that no local differencing was necessary (that 
is, for V^, d=0) , and that the best results arose from the 

application of the non-general operator which took out a 
single sine wave, is precisely in agreement with the hypo- 
thesis concerning the organized motion considered in this 
paper. The fact that the frequency of the organized motion 
in the air is not the same as the dominant frequency of the 
waves is not in good agreement with the hypothesis considered, 
though it is in agreement with Davidson's spectral results 
[Davidson, 1970J. The lack of efficiency in the operator 

for production of stationarity was judged to be due to 
spurious periodicities in the random turbulent components of 
the series. 
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B. ARMA MODEL FITTING 



Once a satisfactorily stationary time se?:ies vas produced, 
a closer examination of the estimated autocorrelation and es- 
timated partial autocorrelation functions revealed that these 
functions showed unmistakable evidence that a mixed model, 
that is, one requiring both autoregressive and moving average 
terms, was required for each series. However, the three ser- 
ies each showed different structure, and were considered 
separately . 

1 • Wave Model 

The estimated autocorrelation function of the waves 
was most easily diagnosed for indications of the proper model. 
The pattern exhibited is that of an ARMA (p,q) x (P,Q) where 
p=0, q=l, F=l, Q=l. The pattern was quite distinct, and over- 
fitting, by -letting p=l, q=2, P=2 , q=2, resulted in estimates 

for the (J)^, 02* ^2 which were close to 

zero, while no significant decrease in the sums of the squares 
of the residuals was achieved. Fitting a smaller model, one 
with fewer parameters, resulted in a significant increase in 
the sum of the squares of the residuals. Hence, an ARMA (0,1) 
X (1,1) was considered best, and the final fitted model was 

where + 0.11, 9^^ --0.53, and = -0.33. 

2 , U-Component Model 

The estimated autocorrelation function of the u-com- 
ponent series did not immediately correspond to a particular 
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model in the way that the waves’ function did. No particular 
need for a seasonal component to the model, other than differ- 
encing, was suggested. Hence, P and Q were initially taken as 
zero, and subsequent fitting of ARHA Cp^q) models showed that 
the best model of this type was the ARMA The final 

fitted model of this type was, after overfitting in both local 
and seasonal parameters, 

(l-(J)^B)w^ = (l-e^B)a^ 

where c})^^ = -0.31, and 6^ = -0.74. 

It was interesting to note that the estimated partial 
autocorrelation showed large values at periodic intervals. 
There seemed to be two periods involved, one of 24 and the 
other of 31. This is a possible indication that a seasonal 
autoregressive operator might be useful. Two such models 
were tried, one with s=24 and one with s=31: 

= (l-e^B)a^ 

and 

(l-(J)^B) (l-$^B^^)Wj. = (l-e^B)a^ 

Each of these models produced improvements in the diagnostic 
checks of the a^’s, but the improvement was slight enough to 
make it questionable if these are parsimonious operators. 

3 . W-Component Model 

The estimated autocorrelation function and the esti- 
mated partial autocorrelation func-tion of the w-component 
series showed no indication that seasonal parameters other 
than differencing were required. After overfitting in both 
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local and seasonal parameters, the ARHA (1,1) x (0,0) was 
judged best. The final model was 

Cl-(})^B)w^ = Cl-e^B) 

where ” -0.09 and 0^ = -0.87. 

C. SUMMARY 

All three time series were most adequately reduced to 
stationarity by application of a difference operator which 
removed a single sine wave, with adaptive amplitude and phase 
angle, from the data. The period of the motion in the air 
thus removed differed from that of the waves, the period in 
the air being A. 7 seconds, as opposed to the 6.0 second 
dominant period in the waves. 

Once reduced to stationarity, the wave series was best 
represented by an ARMA (0,1) x (1,1) model The stationary 
series from the u-component was adequately represented by 
either an ARMA (1,1) x (1,0) or an ARMA (1,1) model. The w~ 
component was best represented by an ARMA (1,1) model. 

The data proved to be of the type for which parametric 
time series analysis is useful. The results show good agree- 
ment with previous results obtained by spectral analysis of 
the same data by Davidson 11970]. 
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V. CONCLUSIONS 



The necessity for removing a single sine wave from the 
data to achieve stability strongly suggests that there was a 
significant sinusoidal component of organized motion in the 
air flov; immediately over the waves during the period in which 
the data were gathered. That this sinusoidal motion has a 
dominant period which is different than that of the waves is 
not supportive of the theories which predict that the air 
will have organized motion with the same period as the waves. 
However, it is noted that the periods in question differ only 
by 1.3 seconds, although this appears significant in this 
densely sampled time series, which had about 25 observations 
per period (observations every .2 seconds). 

The fact that none of the series reduced to a random walk 
model , that is , 

Zj. = a^, 

after being differenced to s tationar ity , suggests that either 
further organized motion exists, or that some sort of struc- 
ture is imposed upon the turbulence. Due to the fact that 
only one series of each type was considered, the significance 
of the ARMA models which were fitted to the properly differ- 
enced series is unknown* Too few series were considered to 
be able to generalize about the structure of the turbulent 
flow from the ARMA models derived here. The possibility 
clearly exists that the repeated fitting of such models may 
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result in a formulation of a general model by an adaptive 
strategy . 

The most important conclusion of this exploratory ivork 
is that parametric time series analysis may be applied to 
wind-wave data with a reasonable expectation that results of 
value can be obtained. It is noted in this respect that 
this analysis closely agreed with the traditional spectral 
approach in their region of overlap. The results in this 
study show that the examination of the estimated autocorrela- 
tion and partial autocorrelation functions of various degrees 
of differencing of the original series may be of value in 
understanding the structure of such series, even if the actual 
fitting of the ARMA model to the resulting series is not 
attempted. In short, the parametric approach appears to hold 
considerable potential in the analysis of organized motion 
within turbulent boundary layer data. 
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APPENDIX A 



Model Identification Considerations 

The initial identification of the proper form of the ARIHA 
to best fit the data is based on examination of the estimated 
autocorrelation function and the estimated partial autocorre- 
lation function of the series. There are two distinct steps 
involved: identifying the degree of differencing necessary and 

identifying the resultant ARMA model. These steps are des- 
cribed briefly below; they are more fully developed in Box 
and Jenkins [1970, pp . 174-207; 300-334], 

The first step in initial model identification is deter- 
mining the degree of differencing necessary to produce a 
stationary time series from the raw data. Thus one is attempt- 
ing to estimate the proper values of D and d in the 
operator. The distinctive characteristic of a non-s tat ionary 
time series for identification purposes is the fact that the 
autocorrelation function does not go quickly to zero with 
increasing lag number. The autocorrelations may decrease 
with lag number, but the decrease will be closer to linear 
than the exponential damping expected with a stationary time 
series. Further, recurrent patterns spaced a fixed distance 
in time (at a fixed number of lags apart) may indicate the 
need for seasonal differencing, if the autocorrelation co- 
efficients do not fall off sufficiently fast. Hence, one 
examines first the estimated autocorrelations of the original 
data, and then successive degrees of differencing, both 
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seasonal and local, until a pattern consistent with a sta- 
tionary series is found. 

The second step is to ejcamine the autocorrelations aris- 
ing from the properly differenced series for indications of 
the resulting ARHA model to fit to the differenced series. 

One is aided in this by the knowledge of the behavior of the 
autocorrelation and partial autocorrelation functions for 
various models. The theoretical autocorrelation function of 
an AR(p) model tails off and the partial autocorrelation 
function abruptly goes to zero after p lags. The theoretical 
autocorrelation function of a MA(q) process goes abruptly to 
zero after q lags, while the partial autocorrelation function 
tails off. Both functions tail off in a mixed model. The 
estimated functions will generally follow the theoretical 
functions. However, the estimates have large variances, and 
the match need not be close. In general, it is possible to 
produce an aut o covariance generating function for any given 
model. This function can be used to show the form of the 
theoretical autocorrelation for any model. One can then com- 
pare the estimated au t o covar ian ce (or autocorrelation) func- 
tion to the theoretical possibilities to find a reasonable 
match . 

In summary, the estimated autocorrelation and partial 
autocorrelation functions are examined for characteristic 
patterns which indicate the proper degree of differencing 
and the proper ARHA model (to be fit to the properly differ- 
enced series). Due to the variability of such estimated 
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s 



functions, precise definition of model is often 
and the skill and experience of the analyst may 
cant in obtaining the best results. 



impossible , 
be signifi- 
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APPENDIX B 



Diagnostic Checks of Model Adequacy 

In order to insure that the model which has been fit is 
adequate, it is useful to have available diagnostic procedures 
for testing the adequacy of a given model. Such tests may be 
based on both evaluations at the total model level and at the 
level of the residuals alone. Several diagnostic tests which 
were used are discussed below; these tests, and others, are 
more fully described in Box and Jenkins [1970, pp. 287 ff]. 

A primary diagnostic check on the total model level is 
the comparison of the magnitudes of the sums of the squares 
of the residuals between models. A significant decrease in 
this value with a change in model indicates that the original 
model was not adequate. A second check on the whole model 
level is overfitting of parameters. If one judges, for exam- 
ple, that an ARIMA (p,d,q) is adequate while noting that if 
it were inadequate, it would likely be inadequate in the AR 
parameter, then one might fit an ARIMA (p+l,d,q) model. One 
could then compare results of original model and the over- 
fitted model for indications of the original model’s adequacy. 
That is, if no significant change in fit occurred, as would 
be the case if the best estimates of the extra parameters in 
the overfit model were all close to zero, then one is re- 
assured as to the adequacy of the model. 

To overcome the need for the prior knowledge of the sus- 
pected deficiencies in the model, such prior knowledge being 
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implied in overfitting, it is possible to use diagnostic 
checks on the residuals generated by the model. The auto- 
correlation function, or equivalently the spectrum, of the 
residuals can be examined for indications that the residuals 
generated do not approximate white noise. Individual auto- 
correlations may be tested for significant deviation from a 
value of zero by Student-t tests, using the appropriate limits 
for estimated residuals rather than those for white noise. 
Finally, the first K autocorrelations of the residuals may 
be looked at as a whole. The value Q such that 

^ 2 

Q = N Z (a), 

m=l 

2 

where r (a) is the residual autocorrelation coefficient at 
m 

lag m, and N = number of observations less d, the difference 

2 , 

ing parameter, is distributed as X (K-p-q-P-Q-M) . Here, the 

p,q,P, and Q are as before, and M=1 if the mean were removed 

from the initial series before the analysis, and M=0 otherwise. 

2 

Hence, the final test discussed here is an X -goodness of fit 
test. 

In summary, tests may be made either between models or 
within models to determine the adequacy of models. Both types 
of tests were used in the analysis presented in this paper. 
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